
US006247141B1 

(12) United States Patent (10) Patent No.: US 6,247,141 B1

Holmberg (45) Date of Patent: Jun. 12, 2001



(54) PROTOCOL FOR PROVIDING REPLICATED 
SERVERS IN A CLIENT-SERVER SYSTEM 

(75) Inventor: Per Anders Holmberg, Stockholm (SE)

(73) Assignee: Telefonaktiebolaget LM Ericsson 
(publ), Stockholm (SE) 

( * ) Notice: Subject to any disclaimer, the term of this 
patent is extended or adjusted under 35 
U.S.C. 154(b) by 0 days. 

(21) Appl. No.: 09/159,771

(22) Filed: Sep. 24, 1998 

(51) Int. Cl.7 G06F 11/14; H04L 29/02

(52) U.S. Cl. 714/2; 714/4; 707/1; 709/203

(58) Field of Search 714/2, 4, 43, 56, 

714/15, 20, 3, 48, 758, 807; 707/1, 10, 
204; 709/203, 217, 212, 227, 101, 228, 
219, 216; 711/162; 370/216 

(56) References Cited 

U.S. PATENT DOCUMENTS 

4,879,716 11/1989 McNally et al. .

5,005,122 * 4/1991 Griffin et al. . 

5,307,481 4/1994 Shimazaki et al. .

5,434,994 7/1995 Shaheen et al. . 

5,452,448 9/1995 Sakuraba et al. .

5,455,932 10/1995 Major et al. . 

5,488,716 1/1996 Schneider et al. . 

5,513,314 4/1996 Kandasamy et al. . 

5,526,492 6/1996 Ishida . 

5,566,297 10/1996 Devarakonda et al. . 

5,581,753 12/1996 Terry et al. . 

5,634,052 * 5/1997 Morris. 

5,652,908 * 7/1997 Douglas et al. . 

5,673,381 9/1997 Huai et al. . 

5,696,895 12/1997 Hemphill et al. . 

5,751,997 5/1998 Kullick et al. . 

5,796,934 8/1998 Bhanot et al. . 

FOREIGN PATENT DOCUMENTS 
0838758A2 4/1998 (EP) . 



OTHER PUBLICATIONS 

Murthy Devarakonda, et al., "Server Recovery Using Natu- 
rally Replicated State: A Case Study," IBM Thomas J. 
Watson Research Center, Yorktown Hts, NY, IEEE Confer- 
ence on Distributed Computing Systems, pp. 213-220, May 
1995. 

Kenneth P. Birman, "The Process Group Approach to Reliable
Distributed Computing", Reliable Distributed Computing
with the Isis Toolkit, pp. 27-57, (ISBN 0-8186-5342-6),
reprinted from Communications of the ACM, Dec. 1993.

Robbert Van Renesse, "Causal Controversy at Le Mont 
St-Michel", Reliable Distributed Computing with the Isis 
Toolkit, pp. 58-67, (ISBN 0-8186-5342-6), reprinted from 
ACM Operating Systems Review, Apr. 1993. 

(List continued on next page.) 
Primary Examiner — Gopal C. Ray 

(74) Attorney, Agent, or Firm — Burns, Doane, Swecker & 
Mathis, L.L.P. 



(57) ABSTRACT



A fault-tolerant client-server system has a primary server, a
backup server, and a client. The client sends a request to the
primary server, which receives and processes the request, 
including sending the response to the client, independent of 
any backup processing. The response includes the primary 
server state information. The primary server also performs 
backup processing that includes periodically sending the 
primary server state information to the backup server. The 
client receives the response from the primary server, and 
sends the primary server state information to the backup 
server. The primary server state information includes all 
request-reply pairs that the primary server has handled since 
a most recent transmission of primary server state informa- 
tion from the primary server to the backup server. The 
primary server's backup processing may be activated peri- 
odically based on a predetermined time interval. 
Alternatively, it may be activated when the primary server's 
memory for storing the primary server state information is 
filled to a predetermined amount. 

10 Claims, 4 Drawing Sheets 

PROTOCOL FOR PROVIDING REPLICATED
SERVERS IN A CLIENT-SERVER SYSTEM

BACKGROUND 

The invention relates to fault tolerant server systems, and
more particularly to fault tolerant server systems including
redundant servers.

High availability of service in a telecommunication system
can be achieved by means of fault tolerant computers or
distributed system architectures. The use of this redundancy,
however, may adversely affect other system properties. For
example, the utilization of redundancy on the hardware level
increases cost, physical volume, power dissipation, fault
rate, and the like. This makes it impossible to use multiple
levels of redundancy within a system.

For example, distributed systems can incorporate replication
between computers, in order to increase robustness. If
each of these computers is fault tolerant, costs will multiply.
Furthermore, if backup copies are kept in software, for
the purpose of being able to recover from software faults, the
cost of the extra memory will multiply with the cost of the
fault tolerant hardware, and for the multiple copies in the
distributed system. Thus, in order to keep costs low, it is
advisable to avoid the use of multiple levels of redundancy.
Since the consequence of such a design choice is that only
one level of redundancy will be utilized, it should be
selected so as to cover as many faults and other disturbances
as possible.

Disturbances can be caused by hardware faults or software
faults. Hardware faults may be characterized as either
permanent or temporary. In each case, such faults may be
covered by fault-tolerant computers. Given the rapid development
of computer hardware, the total number of integrated
circuits and/or devices in a system will continue to
decrease, and each such integrated circuit and device will
continue to improve in reliability. In total, hardware faults
are not a dominating cause for system disturbances today,
and will be even less so in the future. Consequently, it will
be increasingly more difficult to justify having a separate
redundancy, namely fault tolerant computers, just to handle
potential hardware faults.

The same is not true with respect to software faults. The
complexity of software continues to increase, and the
requirement for shorter development time prevents this
increasingly more complex software from being tested in all
possible configurations, operation modes, and the like. Better
test methods can be expected to fully debug normal
cases. For faults that occur only on very special occasions,
the so-called "Heisenbugs", there is no expectation that it
will be either possible or economical to perform a full test.
Instead, these kinds of faults need to be covered by redundancy
within the system.

A loosely coupled replication of processes can cover
almost all hardware and software faults, including the temporary
faults. As one example, it was reported in I. Lee and
R. K. Iyer, "Software Dependability in the Tandem Guardian
System," IEEE TRANSACTIONS ON SOFTWARE
ENGINEERING, vol. 21, No. 5, May 1995 that checkpointing
(i.e., the copying of a present state to a stand-by
computer) and restarting (i.e., starting up execution from a
last checkpointed state by, for example, reading a log of the
transactions that have occurred since the last checkpoint and
then starting to process new ones) covers somewhere
between 75% and 96% of the software faults, even though
the checkpointing scheme was designed into the system to
cover hardware faults. The explanation given in the cited
report is that software faults that are not identified during test
are subtle and are triggered by very specific conditions.
These conditions (e.g., memory state, timing, race
conditions, etc.) did not reoccur in the backup process after
it took over; consequently, the software fault does not
reoccur.

A problem with replication in a network is that there are 
a few services, such as arbitration of central resources, that 
do not lend themselves to distribution. This type of service 
must be implemented in one process and needs, for perfor- 
mance reasons, to keep its data on its stack and heap. To 
achieve redundancy, this type of process must then be 
replicated within the distributed network. In a high perfor- 
mance telecommunication control system this replication 
must be done with very low overhead and without introduc- 
ing any extra delays. 

SUMMARY 

It is therefore an object of the present invention to provide 
methods and apparatuses for implementing a fault-tolerant 
client-server system. 

In accordance with one aspect of the present invention, 
the foregoing and other objects are achieved in a fault- 
tolerant client-server system that comprises a primary 
server, a backup server and a client. The client sends a 
request to the primary server. The primary server receives 
and processes the request, including sending a response to 
the client, independent of any backup processing being 
performed by the primary server, wherein the response 
includes primary server state information. By sending the 
response independent of backup processing, a higher level 
of concurrence is achieved, thereby making the system more 
efficient. The primary server also performs backup 
processing, including periodically sending the primary 
server state information to the backup server. The client 
receives the response from the primary server, and sends the 
primary server state information from the client to the 
backup processor. 

In another aspect of the invention, the primary server state 
information includes all request-reply pairs that the primary 
server has handled since a most recent transmission of 
primary server state information from the primary server to 
the backup server. 

In yet another aspect of the invention, the primary server 
stores the primary server state information in storage means. 
The act of performing backup processing in the primary 
server may be performed in response to the storage means 
being filled to a predetermined amount. 

In an alternative embodiment, the act of performing 
backup processing in the primary server may be performed 
periodically based on a predetermined time interval. 

BRIEF DESCRIPTION OF THE DRAWINGS 

The objects and advantages of the invention will be 
understood by reading the following detailed description in 
conjunction with the drawings in which: 

FIG. 1 is a block diagram that illustrates the use of 
redundant servers in a client-server application; 

FIG. 2 is a diagram illustrating the message flow in a 
fault-tolerant client-server application;

FIG. 3 is a diagram illustrating the flow of messages 
between a client, a primary server and a backup server in 
accordance with one aspect of the invention; and 

FIGS. 4a and 4b illustrate an efficiency improvement that 
is accomplished by means of the use of causal ordering in 
communications between processes. 

DETAILED DESCRIPTION

The various features of the invention will now be 
described with respect to the figures, in which like parts are 
identified with the same reference characters. 

FIG. 1 is a block diagram that illustrates the use of
redundant servers in a client-server application. In
particular, a plurality of client applications, C, are shown. A
primary server, S 101, runs on a first processor 103. A
second processor 105, which is separate from the first
processor 103, runs a backup server, S' 107, in parallel with
the primary server S 101. Overall, so that when one fails, the
other can take over without any client application C noticing
the problem, the primary server S 101 and the backup server
S' 107 should have the same internal state at a virtual time,
T, that occurs after processing any specific request from the
client application C. (Since the backup server S' 107 trails
the primary server S 101, the backup server S' 107 reaches
the virtual time later in real time than the primary server S
101 does.) The existence of replicated server processes
should not be visible to the client applications C using the
server. In order to implement such a strategy, the following
problems need to be solved:

Addressing: The client application C should address the
server in a consistent way, regardless of whether the
service is being performed by the primary server S 101 or
the backup server S' 107 (or both).

Replication and Synchronization: Incoming requests from
different client applications C, as well as fault and repair
notifications, can arrive in different order to primary
server S 101 and backup server S' 107 due to differences
in the physical network between processors. However,
these requests must be sorted in the same order.

Fault and Repair Notifications: Server process failure and
the start of a new server process must be detected by the
server that is still working.

State Transfer: When a server process restarts after a failure,
the working server must transfer its internal state to the
new server before it can start processing requests.

In addressing the above problems, a preferred embodiment
of the invention attempts to satisfy the following goals:

Solve the replication problem only once. The implementation
of replication has many pitfalls and is complicated
to verify. There are many possible faults that must
be covered.

Add only a low overhead, and impose this only on
communications to replicated processes.

Worst case response times during normal operation, in the
case of failure, and also when reintegrating a new
process should all be known in advance and kept to
acceptable levels.

No extra messages should be added to critical timing
paths. Many conventional implementation techniques
violate this goal. For example, a primary server may
have to send a message to the secondary server and get
a reply back before sending a reply back to the client.
It is desired to avoid this so that the system's real-time
response times are not slowed down by the added
redundancy.

Handle many clients and dynamic clients. Telecommunication
applications typically have many possible clients
for a server. This means that one cannot use algorithms
that, for example, must update information in the
clients when the server process fails or recovers. Also,
client processes typically have short lifetimes (they
may exist only during a call). This means that algorithms
that require the server to keep track of clients
cannot be used.

In order to make the protocol simpler, a preferred embodi- 
ment of the invention imposes several restrictions. Some of 
these restrictions can easily be lifted by making the protocol 
more general. However, their inclusion here facilitates a 
description of the underlying mechanisms involved. These 
restrictions are: 
Only two servers are involved: a primary and a backup. It 
will be apparent to those of ordinary skill in the art that 
the protocol can be extended to include more. 
Tolerance for one fault at a time, that is, a single client or 
server failure. The system must recover (for example 
by starting up a cold stand-by) before another fault can 
be tolerated. 

Simple network configurations. Complicated network
fault cases that, for example, split the network in two, 
with one of the server pairs in each, are not considered. 

No large messages. Bulk data transfers and the like will 
probably overflow buffers or queues. 

Soft real-time responses. In the normal case (i.e., without 
any malfunctioning server) it is possible to guarantee 
approximately the same response times as for systems 
utilizing non-replicated servers. However, longer 
response times must be accepted at the time of failure, 
recovery and reintegration. These longer response 
times can still be guaranteed not to exceed a predeter- 
mined maximum amount of time. 

Deterministic operation of servers. As will be described in 
greater detail below, the backup server will receive 
periodic update messages from the primary server. The 
processing of these update messages in the backup 
server must be deterministic in order to guarantee that 
it will reach the same internal state as that of the 
primary when sending the update message. The server 
software cannot include non-deterministic system calls, 
such as calls to a time-of-day clock (which returns a 
different result, depending on when it is called), 
because such calls would cause the backup server to 
reach an internal state that differs from that of the 
primary server. 

Thus, the state of the backup server must be 100% 
specified by the information that it receives from the 
primary server. This can be achieved in either of two 
ways: 

a) the requests supplied to the primary server are also 
transferred to the backup server, which then reaches 
the same state as the primary server by doing iden- 
tical processing of the request; or 

b) the results of processing (i.e., the reply to the client
that is generated by the primary server, as well as the
changes in the server's internal state) are sent to the 
backup server. 

Simple applications only. In the description of the inven- 
tive protocol set forth below, the replicated server 
cannot request services from other servers. The proto- 
col would have to be extended in order to handle such 
a case. In one such extension, the second server would 
then detect that a request comes from a replicated 
server and follow the same (or similar) protocol. 
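
Alternative (a) of the deterministic-operation restriction can be illustrated with a minimal Python sketch. The `ResourceServer` class, its `handle` method and the request tuples are hypothetical names chosen for illustration and do not appear in the patent. The sketch only shows that a handler whose reply depends solely on the current state and the request lets a backup reach the primary's state by processing the same requests in the same order.

```python
class ResourceServer:
    """Hypothetical server that arbitrates named resources (illustration only)."""

    def __init__(self):
        self.owners = {}  # resource name -> client id currently holding it

    def handle(self, request):
        """Deterministic: the reply depends only on current state and the request."""
        op, resource, client = request
        if op == "acquire":
            if resource in self.owners:
                return ("denied", self.owners[resource])
            self.owners[resource] = client
            return ("granted", client)
        if op == "release":
            if self.owners.get(resource) == client:
                del self.owners[resource]
                return ("released", client)
            return ("ignored", client)
        return ("error", client)


requests = [
    ("acquire", "trunk-1", "c1"),
    ("acquire", "trunk-1", "c2"),
    ("release", "trunk-1", "c1"),
    ("acquire", "trunk-1", "c2"),
]

primary = ResourceServer()
primary_replies = [primary.handle(r) for r in requests]

# The backup performs identical processing of the same requests in the same order.
backup = ResourceServer()
backup_replies = [backup.handle(r) for r in requests]
```

If `handle` consulted a time-of-day clock or any other input not carried in the request, the two instances could diverge, which is why such calls are excluded.
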
Earlier, four problems that need to be solved were men- 
tioned. An inventive solution to one of these, namely rep- 
lication and synchronization, will now be described. In a 
preferred embodiment, replication and synchronization are 
implemented as part of the communication protocol that is 
used between the client and the server. Advantages of this 
approach are: 

The implementation is done only once, when the protocol
is designed.

The replication is hidden from the application. The protocol
handles addressing of the replicated servers.

The inventive protocol is designed for efficient implementation
of the desired replication and synchronization:

1) Two alternative implementations are possible:
a) The implementation may be an extension to the
communication method. This means that there would
be no extra system calls for processing a request
from a client in the primary server.
b) As an alternative, the protocol may be integrated into
the protocol stack. This makes it possible to make
more efficient implementations.
So-called "middleware" solutions, in which fault tolerance
is implemented by a layer of software on top
of an existing operating system, would benefit from
the first alternative (i.e., alternative "a") but not from
the second (i.e., alternative "b").

2) The replication between servers can be outside the
real-time critical loop. The client can get a reply as fast
as the primary server S 101 can respond.

3) The extra information needed for keeping redundancy
is attached to the reply in order to minimize overhead.

4) Updates/Heartbeats to the backup server S' 107 are
done periodically in order to minimize overhead and to
make it possible to guarantee that the recovery time
after a fault will not exceed a predefined maximum.
The number of requests that can be processed by the
primary server but not by the backup server will be
limited to the number of requests that can arrive
between two periodic updates.

5) The replication can be supported within an I/O processor
giving no overhead at all on the main processor.

The protocol guarantees that processed requests as well as
information about the order in which the requests are
processed, are always kept in two independent places in two
separate computers. This strategy is based on two observations:

1) Redundant copies of the primary server state may be
established at a later time than is conventionally
performed, while still maintaining fault tolerance. That
is, in conventional systems, server state information is
transferred from the primary server to the backup
server prior to sending a reply to the client. However,
the invention recognizes that this is a conservative
approach, because prior to sending a reply to the client,
no other processor has seen the result. Consequently, a
primary server crash would be considered to have
occurred before the processing of the request. This is
the case up to the time when the client receives the
reply. This, then, is the latest possible time for establishing
the existence of a redundant copy of the server
state in order to have fault tolerance.

2) There are three independent parties involved: the client
application C requesting a service, the primary server S
101, and the backup server S' 107. At any time it is
sufficient that critical information be maintained in two
redundant copies. However, these copies need not be
maintained only by the primary server S 101 and the
backup server S' 107 (as in a conventional two-phase
commit protocol). Rather, the client can also be used
for (temporarily) holding information.

For a simple server application, the replication is based on
a message flow as illustrated in FIG. 2. A client application,
C, accesses a primary server 101 via a protocol stack 205
running in the client processor. Counterpart protocol stacks
215, 215' also run in the primary and backup server
processors, PRO1 and PRO2. Requests 201 are sent from the
client application C to the primary server S 101. The
protocol stack 215 of the primary server S 101 attaches a
sequence number to the request and then processes the
request. As a result of processing the request, the primary
server S 101 generates and sends a reply message 203, via
the protocol stack 215, to the client application C immediately.
In accordance with one aspect of the invention, the
server's protocol stack 215 performs the additional function
of storing the incoming request 201 in a queue whose
contents are periodically communicated, via backup path
209, to the protocol stack 215' of the backup server S' 107.
In accordance with another aspect of the invention, the reply
message 203 to the client C also includes information
indicating at what point in a sequence of incoming requests
(since the last flush) the client's request 201 was processed
(i.e., the sequence number).

When the client application's protocol stack 205 receives
the reply message 203, it does two things: 1) it passes the
reply message 203 to the client application C, and 2) it sends
a message 207 that may contain, for example, the original
request as well as the reply to the backup server's protocol
stack 215', which passes it to the backup server S' 107. In some
embodiments, the backup server's protocol stack 215' may
send an acknowledge message 211 to the client's protocol
stack 205, thereby confirming receipt of the client's message.

In addition to the backup server's receiving information
from the client application's protocol stack 205, whenever
the queue in the primary server's protocol stack 215 reaches
a predetermined value, or alternatively when a predetermined
amount of time has elapsed, the queue in the primary
server's protocol stack 215 is flushed to the backup server S'
107 via backup path 209. In addition to supplying the vital
redundant information to the backup server S' 107, the act of
flushing also serves as a heartbeat, indicating to the backup
S' 107 that the primary server S 101 is still alive. The time
between flushes/heartbeats sets the maximum time for
recovery when there is a fault.

The backup server S' 107 takes over execution when it
fails to receive one or more heartbeats from the primary
server S 101 and starts receiving requests from clients C.

The information that should be passed on to the backup
server in order to guarantee that recovery is possible is: a)
the original request, and b) the sequence number that was
appended in the reply message. With this information, the
back-up will (after a crash) be able to sort the requests to be
in the same order in which they were processed by the
primary server and then perform identical processing. The
same information may be passed to the backup server S' 107
from both the client application's protocol stack 205 and the
primary server's protocol stack 215, although in the case of
information coming from the primary server's protocol stack
215, the sequence number is of less importance because the
copy of the incoming requests may typically be passed on in
the order in which they were processed.

Passing the entire primary server reply message
(including the sequence number) to the backup makes it
possible for the backup server to improve fault detection. In
addition to using the sequence number for sorting out
message order, the backup server S' 107 can then also verify
that it is in synchronization with the primary server by
comparing its own reply to the one from the primary server
S 101. It should be noted, however, that it is sufficient to pass
on a substitute for this information, such as a checksum of
the reply, for this purpose as well.

For the fault detection purpose, the full reply information
can be passed on from either source (i.e., via the client C or 
via periodic updates from the primary server S 101) or from 
both. In one embodiment, the full reply information is 
passed only via the periodic updates from the primary 
server's protocol stack 215 in order to minimize the amount 
of information that has to go the longer path via the client's 
protocol stack 205. 
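
The checksum substitute mentioned for fault detection can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the dictionary layout of the update and the choice of CRC-32 are all hypothetical, since the patent does not name a particular checksum.

```python
import zlib


def reply_checksum(reply: bytes) -> int:
    """CRC-32 of the reply; any checksum both servers agree on would do."""
    return zlib.crc32(reply)


def backup_in_sync(own_reply: bytes, primary_checksum: int) -> bool:
    """True if the backup's own reply matches the checksum sent by the primary."""
    return reply_checksum(own_reply) == primary_checksum


# The primary sends only the checksum of its reply, not the reply itself.
primary_reply = b"granted:trunk-1:c1"
update = {"seq": 7, "reply_checksum": reply_checksum(primary_reply)}
```

After replaying the request with sequence number 7, the backup would call `backup_in_sync` on its own reply and treat a mismatch as a sign that the two servers have diverged.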

There are also several alternatives to appending the 
sequence information to reply messages in the text. One 
alternative is to just append the sequence number the request 
was processed in. Another alternative is to include the entire 
request sequence since the last periodic update. These alter- 
natives serve the same purpose, and each can be regarded as 
"server state information" because they each define the order 
of the actions that the backup server S' 107 must take in
order to achieve an identical state as that of the primary 
server S 101. 
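
The sequence-number mechanism can be sketched in Python as shown here. The class and function names are hypothetical, the sequence counter runs globally rather than restarting at each flush, and the flush is performed inline for brevity, whereas the protocol described above keeps backup processing independent of the reply path. The sketch shows the first alternative, in which only the sequence number is attached to each reply.

```python
class PrimaryStack:
    """Sketch of the primary-side stack: number requests, queue them, flush in batches."""

    def __init__(self, handler, flush_threshold=3):
        self.handler = handler              # application-level request handler
        self.flush_threshold = flush_threshold
        self.next_seq = 0
        self.queue = []                     # (seq, request) pairs not yet flushed
        self.flushed = []                   # batches sent over the backup path

    def on_request(self, request):
        seq = self.next_seq
        self.next_seq += 1
        reply = self.handler(request)
        self.queue.append((seq, request))
        if len(self.queue) >= self.flush_threshold:
            self.flush()  # inline here; a real stack would do this off the reply path
        # The sequence number travels with the reply as server state information.
        return {"seq": seq, "reply": reply}

    def flush(self):
        """Hand the queued pairs to the backup path; this doubles as a heartbeat."""
        self.flushed.append(list(self.queue))
        self.queue.clear()


def recover_order(from_flushes, from_clients):
    """Backup side: merge both sources, drop duplicates, sort by sequence number."""
    merged = {}
    for seq, request in list(from_flushes) + list(from_clients):
        merged[seq] = request
    return [merged[seq] for seq in sorted(merged)]
```

After a primary crash, the backup would pass the last flushed batch and the updates forwarded by clients to `recover_order` and then process the resulting requests one by one.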

A number of fault cases, and how the invention handles 
them, will now be described: 

Primary Server Crash Before Reply is Sent 
In this case, the client C will not receive an acknowledge 
(i.e., reply message 203) from the primary server S 101. In 
response, the protocol stack 205 of the client C re-transmits 
the original request 201 to both the primary and secondary 
servers S 101, S' 107. Otherwise (i.e., in non-fault cases), the 
client application C sends the requests only to the primary 
server S 101. (It should be noted that the client application 
is generally unaware of this fault tolerance-related activity, 
since it addresses only the single logical server. Address 
translation and communication to the two servers, S 101 and 
S' 107, are handled by the protocol stack 205 within the
client processor.) If the secondary server S' 107 misses the 
heartbeats from the primary server S 101, it takes over. 
Otherwise, it simply discards the request received from the 
client C. 

Primary Server Crash after Sending a Reply but before
Information is Flushed to Backup

The information needed for updating the backup server S' 
107 to the state that existed when the last reply was sent can 
be retrieved from update messages supplied by the client's 
protocol stack 205. Messages in the "reply path" from the 
primary server S 101 to the client C contain both the reply 
to the client application as well as the update information to 
the backup server S' 107. The client application need receive 
only the reply information from the client C, not the addi- 
tional update information. As shown in FIG. 2, the update 
information is forwarded from the client's protocol stack 
205 to the backup server S' 107 (via the backup server's 
protocol stack 215'). This update information is the same 
information that the backup server S' 107 otherwise receives
by means of the periodic updates that are directly commu- 
nicated by the primary server S 101. The cost of adding 
some extra information in an already existing message is 
small compared to having to send an extra message for it. 

Client Crash after Sending Initial Request 

In this case, the backup server S' 107 receives information 
for updating itself when the primary server flushes its queue. 

Primary System Crash 

The primary server S 101 as well as any clients executing 
in the same processor 103 will be lost. The backup server S' 
107 executes remaining commands from the last flushed 
queue and then gets updated up to the point given by the last 
reply to a client that is executing outside of the primary 
server's processor 103. 

Message Loss 

Messages that do not get an immediate acknowledgment 
are re-transmitted once or twice before the receiving process 
(or processor) is considered to be faulty. 
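
The message-loss rule can be sketched as a bounded retry loop. The `send` callable is a hypothetical transport that returns True when an acknowledgment arrives in time and False on timeout; the default of two retries reflects the "once or twice" wording above and is otherwise an assumption.

```python
def send_with_retries(send, message, max_retries=2):
    """Send `message`, retrying up to `max_retries` times without acknowledgment.

    Returns True as soon as one attempt is acknowledged, or False when every
    attempt timed out and the receiver should be considered faulty.
    """
    for _attempt in range(1 + max_retries):
        if send(message):
            return True
    return False
```

A caller that receives False would then treat the receiving process (or processor) as failed, for example by retransmitting the request to both servers.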

The client's protocol stack 205 will now be described in
greater detail with reference to FIG. 3. At step 301, client
application execution causes a request to be sent to the
primary server. At step 302, the request is processed in the
protocol stack 205 and sent to the primary server. The
protocol implements re-transmission at message loss and a
copy of the message is kept for doing this. At step 303, a
reply has returned from the primary server. The reply is sent
back to the client application processes without delay. A
copy of the request and the associated reply are kept for the
replication protocol. Because, in this example, the primary
server is presumed to respond relatively quickly, there is no
separate acknowledgment sent from the primary server to
the client. That is, the reply that is returned from the primary
server is sufficient to function as an acknowledgment. In other
embodiments that include a relatively slow primary server, it may
be necessary for the protocol to include a separate acknowledgment
that would be sent from the primary server to the
client prior to transmission of the reply.

At step 304, the application process can resume execution
without waiting for the replication to be performed. At step
305, the protocol stack 205 stores the request as well as the
reply in a queue that is designated for requests that are not
yet replicated to the backup server.

At step 306, the client sends a message containing the
original request as well as the reply to the backup server. In
response, the backup server returns an acknowledgment
(step 307) to the client, in order to confirm safe receipt of the
client's message. It will be noted that without the
acknowledgment, the client would have no other way of
knowing that its message had been received because no
other reply is expected from the backup server.
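
Steps 302 through 307 can be sketched as a small client-side class. All names are hypothetical: `send_to_primary` and `send_to_backup` stand in for transports the caller would supply, and the reply is assumed to be a dictionary carrying the sequence number alongside the application reply.

```python
class ClientStack:
    """Sketch of the client-side protocol stack (steps 302 to 307)."""

    def __init__(self, send_to_primary, send_to_backup):
        self.send_to_primary = send_to_primary   # request -> {"seq": ..., "reply": ...}
        self.send_to_backup = send_to_backup     # update -> True when acknowledged
        self.unreplicated = []                   # pairs not yet confirmed by the backup

    def request(self, request):
        response = self.send_to_primary(request)        # steps 302 and 303
        self.unreplicated.append((request, response))   # step 305
        return response["reply"]                        # application resumes (step 304)

    def replicate(self):
        """Steps 306 and 307: forward queued pairs; keep any that are not acknowledged."""
        still_pending = []
        for request, response in self.unreplicated:
            update = {
                "request": request,
                "seq": response["seq"],
                "reply": response["reply"],
            }
            if not self.send_to_backup(update):
                still_pending.append((request, response))
        self.unreplicated = still_pending
```

Keeping `replicate` separate from `request` mirrors the point that the application does not wait for replication; an actual stack would presumably run it in the background.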

Earlier, several other problems were mentioned, namely
Fault and Repair Notifications and State Transfer. The
inventive solutions to these problems will now be described.

With respect to Fault and Repair Notification, the communication
between the primary and secondary server also
functions as a heartbeat. If the secondary server does not get
updated regularly, it waits long enough to receive any
outstanding client timeouts and then takes over. When a
server process restarts, it checks whether there is an active
primary server.
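
The takeover rule can be sketched as a small monitor on the backup side. The class name and both parameters are hypothetical, and timestamps are passed in explicitly so that the logic stays deterministic; the patent does not prescribe specific interval or timeout values.

```python
class HeartbeatMonitor:
    """Backup-side sketch: decide when to take over from the primary."""

    def __init__(self, heartbeat_interval, client_timeout):
        self.heartbeat_interval = heartbeat_interval  # expected time between flushes
        self.client_timeout = client_timeout          # grace period for client timeouts
        self.last_update = 0.0

    def on_update(self, now):
        """Called whenever a flush (heartbeat) arrives from the primary."""
        self.last_update = now

    def should_take_over(self, now):
        """True once a heartbeat is overdue and the grace period has also passed."""
        silence = now - self.last_update
        return silence > self.heartbeat_interval + self.client_timeout
```

The extra `client_timeout` term models the wait for outstanding client timeouts, so that retransmitted requests have a chance to arrive before the backup starts processing.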

Regarding State Transfer, this is used at the time of
restarting a failed server. The state of the executing server
must then be copied to the restarted one before they, again,
can work as a primary/backup pair. There is no fundamental
difference between this state transfer and the type of state
transfer needed when doing system software and hardware
upgrades. Also, given the low number of hardware failures
in modern processors, the state transfer mechanisms should
be optimized for system upgrades.

It will be recalled that one aspect of the invention is a
requirement that requests from different clients, as well as
fault and repair notifications, must be sorted in the same
order, even though they may arrive in different orders in the
primary and backup servers S 101 and S' 107. Thus, in some
embodiments it may be beneficial to provide a mechanism
for enforcing causal dependency (also referred to herein as
"causal ordering") between messages. Essentially, this refers
to the processing of messages in the order in which they had
been logically issued, rather than in the strict order in which 
they may have been received. A more complete description 
of causal ordering may be found in connection with a 
description of the ISIS toolkit, which was developed by
Cornell University in Ithaca, New York, USA. The description
may be found in K. P. Birman and R. van Renesse,
"Reliable Distributed Computing with the ISIS toolkit,"
published 1994 by IEEE COMPUTER SOCIETY PRESS,
ISBN 0-8186-5342-6. Causal ordering can be implemented
with low overhead and can improve system efficiency by
allowing a higher degree of concurrence. FIGS. 4a and 4b
illustrate this efficiency improvement. In FIG. 4a, a processor
Pro1 sends a request for a resource to a resource handler
Pro2 (step 401). Without support for causal ordering in the
underlying system, Pro2 must send a message to the
resource Pro3 to initialize it (step 402). After the resource
has replied that it is ready (step 403), Pro2 is now permitted
to send a reply to Pro1, informing it that the resource is
available (step 404). The processor Pro1 can now send a
message to the resource Pro3 (step 405). It will be observed
that the behavior of each processor is constrained by
restrictions designed to prevent one processor from receiving
(and consequently processing) a later-sent message prior to an
earlier-sent message.

Referring now to FIG. 4b, this illustrates an example in
which the underlying system supports causal ordering.
Again, the example begins with the processor Pro1 sending
a request for a resource to a resource handler Pro2 (step
406). Now, the resource handler Pro2 does not need to wait
for a reply from Pro3. Instead, it immediately sends a reply
to Pro1 informing that the resource is available (step 407).
At approximately the same time, Pro2 sends a message to
the resource Pro3 to initialize it (step 408). Because of this
concurrence, the processor Pro1 is able to send its message
to the resource Pro3 (step 409) much sooner than in the
example without causal ordering (FIG. 4a). This does not
create any problems because the causal message ordering
guarantees that Pro3 will process the initialization message
before receiving the message from Pro1, even if the message
from Pro2 gets delayed (alternative step 408').

It is not necessary to implement a full causal ordering
model for the limited case in which clients call a replicated
server, because in such cases the sequence number is
sufficient to enable the replicated server to process requests in
the proper order. However, the full model is called for when
the protocol is extended to a more general case, such as to
allow a replicated server to call another replicated server.

The invention has been described with reference to a
particular embodiment. However, it will be readily apparent
to those skilled in the art that it is possible to embody the
invention in specific forms other than those of the preferred
embodiment described above. This may be done without
departing from the spirit of the invention. The preferred
embodiment is merely illustrative and should not be
considered restrictive in any way. The scope of the invention is
given by the appended claims, rather than the preceding
description, and all variations and equivalents which fall
within the range of the claims are intended to be embraced
therein.

What is claimed is:

1. A fault-tolerant client-server system, comprising:
  a primary server;
  a backup server; and
  a client,
  wherein:
  the client comprises:
    means for sending a request to the primary server;
    means for receiving a response from the primary
      server, wherein the response includes primary server
      state information;
    means for sending the primary server state information
      to the backup server;
  the primary server comprises:
    means for receiving and processing the request;
    means, responsive to the request, for sending the
      response to the client, independent of any backup
      processing, wherein the response includes the
      primary server state information;
    means for performing backup processing that
      includes periodically sending the primary server
      state information to the backup server; and
  the backup server comprises:
    means for receiving the primary server state
      information from the primary server;
    means for receiving the primary server state
      information from the client.

2. The fault-tolerant client-server system of claim 1,
wherein the primary server state information includes all
request-reply pairs that the primary server has handled since
a most recent transmission of primary server state
information from the primary server to the backup server.

3. The fault-tolerant client-server system of claim 1,
wherein the primary server state information includes a
checksum derived from a reply.

4. The fault-tolerant client-server system of claim 1,
wherein the primary server's means for performing backup
processing is activated periodically based on a predetermined
time interval.

5. The fault-tolerant client-server system of claim 1,
wherein:
  the primary server further includes means for storing the
    primary server state information; and
  the primary server's means for performing backup
    processing is activated in response to the means for storing
    the primary server state information being filled to a
    predetermined amount.

6. A method of operating a fault-tolerant client-server
system that comprises a primary server, a backup server and
a client, the method comprising the steps of:
  sending a request from the client to the primary server;
  in the primary server, receiving and processing the
    request, including sending a response to the client
    independent of any backup processing being performed
    by the primary server, wherein the response includes
    primary server state information;
  performing backup processing in the primary server,
    including periodically sending the primary server state
    information to the backup server;
  in the client, receiving the response from the primary
    server; and
  sending the primary server state information from the
    client to the backup processor.

7. The method of claim 6, wherein the primary server state
information includes all request-reply pairs that the primary
server has handled since a most recent transmission of
primary server state information from the primary server to
the backup server.

8. The method of claim 6, wherein the primary server state
information includes a checksum derived from a reply.

9. The method of claim 6, wherein the step of performing
backup processing in the primary server is performed
periodically based on a predetermined time interval.

10. The method of claim 6, wherein:
  the primary server further performs the step of storing the
    primary server state information in storage means; and
  the step of performing backup processing in the primary
    server is performed in response to the storage means
    being filled to a predetermined amount.

* * * * *

[57] ABSTRACT

A method and system provide for selectively distributing 
communications between an application and multiple 
servers, allowing cooperative use of a single copy of an 
application. The system is situated between an application 
and the multiple servers. Requests from the application, 
responses to the requests, and events from the multiple 
servers, are managed in such a way that each server believes 
it is connected directly to the application and the application 
believes it is connected directly to a single server. The 
requests are categorized and distributed to the servers based 
on the type of request. The responses to these requests may 
be sent to the application or discarded based on the type of 
request and the role of the server sending the request. The 
events are also categorized and, based on the role of the 
server causing the event, they may be passed on to the 
application or discarded. 

10 Claims, 3 Drawing Sheets

SYSTEM FOR CLASSIFYING AND SENDING
SELECTIVE REQUESTS TO DIFFERENT 
PARTICIPANTS OF A COLLABORATIVE 
APPLICATION THEREBY ALLOWING 
CONCURRENT EXECUTION OF 
COLLABORATIVE AND NON- 
COLLABORATIVE APPLICATIONS 

CROSS-REFERENCE TO RELATED 
APPLICATIONS 

The present application is related to U.S. patent applica- 
tion Ser. No. 08/387,500, entitled Method and System For 
Switching Between Users In A Conference Enabled Appli- 
cation now U.S. Pat. No. 5,557,725, U.S. patent application 
Ser. No. 08/387,502, entitled Method for Managing Top- 
Level Windows Within a Conferencing Network System, 
U.S. patent application Ser. No. 08/387,503, entitled Method 
For Managing Visual Type Compatibility In A Conferencing 
Network System Having Heterogeneous Hardware now 
U.S. Pat. No. 5,715,392, U.S. patent application Ser. No. 
08/387,504, entitled Method To Support Applications That 
Allocate Shareable Or Non-Shareable Colorcells In A Con- 
ferencing Network System Having A Heterogeneous Hard- 
ware Environment, U.S. patent application Ser. No. 08/387, 
505, entitled Method For Managing Pixel Selection In A 
Network Conferencing System, U.S. patent application Ser. 
No. 08/387,506, entitled Method And Apparatus For Trans- 
lating Key Codes Between Servers Over A Conference 
Networking System now U.S. Pat. No. 5,640,540, all filed of 
even date herewith by the inventors hereof and assigned to
the assignee herein, and incorporated by reference herein. 

BACKGROUND OF THE INVENTION 

1. Technical Field 

The present invention relates in general to the field of data 
processing systems and in particular to the field of multiple 
user systems. Still more particularly, the present invention 
relates to the field of enabling multiple users to simulta- 
neously use a single user application. 

2. Description of the Related Art 

The need to communicate as a group when the partici- 
pants are not in the same room is becoming increasingly 
common. Past solutions include the use of faxes, telecon- 
ferencing and video conferencing. However, there have been 
few solutions for groups wanting to interact through a 
computer application. The participants can travel and meet 
in a single physical location, but the expense is often 
prohibitive. The participants could use a file sharing 
arrangement, but they would only be able to see their own 
session, not what the other participants are doing. Another 
approach is to allow everyone to see a view of one person's 
screen, but allow only the person with the application to 
interact with the application. Rewriting existing software to 
function in a multiuser mode is rarely a feasible solution. 

The use of a conference has been proposed, and allows all 
participants to see the same working session. The use of this 
type of system allows pre-existing applications written for a 
single-user environment to be used from within the frame- 
work of a multi-user conference. A conferencing enabling 
module is located between the application and the users, and 
controls access to the application. These types of systems 
allow pre-existing applications to be used in a conference 
without the need for modifying the application. With this 
approach the problems which arise relate to handling input 
from multiple users. 

The pre-existing applications are written with the
assumption that they will be used by a single user. This assumption
can cause some problems in a conferencing system because
it leads to other assumptions. If there is only one user then
the hardware of that user is not likely to change, the
application is receiving only one stream of input, and the
user wants communication from the application. These
assumptions do not always hold true in a conferencing
environment. The users in the conference may have different
hardware, input may be sent from several workstations, and
some users may be working on other applications, either in
the conference or locally, and do not want communication
from the application. The conferencing enabler must be able
to handle outputs from and inputs to the applications and
workstations so the application being conferenced is
protected by giving the application the appearance of a single
user environment.

U.S. Pat. No. 5,195,086, issued to AT&T Bell 
Laboratories, discloses a communication conferencing 
application which controls multiple concurrent calls sharing 
applications. This is effectuated by pseudo servers which
control the flow of events from the servers to the application
and necessary X resource identifier translation. This imple- 
mentation only allows one party to input at a time. The 
method for controlling the flow of events for the X resource 
identifier translation is mentioned but not disclosed. 

A system has also been disclosed by R. Wiss in the X
Windows/MOTIF User Interface Server document. In this
disclosure a system and method are described for the
dynamic sharing of user interfaces which are coupled to
applications, and a window management system provides
concurrent event handling for multiple applications. This
system and method controls the events from user interfaces 
of multiple applications. 

U.S. Pat. No. 5,293,619, entitled "Method and Apparatus
for Collaborative Use of Application Program", has a similar
approach to the current invention. It discloses a single 
system, between an application and multiple X servers, 
which passes output requests from the application to the 
servers and passes events from the servers to the application. 

One problem with the approach of the '619 patent is that,
by passing requests to all servers, users cannot use
applications other than the currently conferenced application. An
example of the problem is when the conferenced application
issues a request to freeze the keyboard while it is doing some
type of activity. If this request is sent to all users, their
keyboards will be frozen until the conferenced application 
has issued the unfreeze keyboard request, thereby inhibiting 
a user from doing any activity outside the conferenced 
application, such as checking E-mail. 

Another problem with the system described in the '619 patent
is that by passing all events from the servers to the
application, the program can get confused unless the users
coordinate themselves well. An example of this problem
occurs when two engineers are working on a CAD drawing
of a widget and one engineer wants to move a part of the
widget while the other engineer wants to delete the same part
of the widget. Unless the two engineers coordinate their
activities over the phone, the command to delete the part
could be entered before the other engineer attempts to move
the part and the program would try to move a nonexistent
part. Unless there is sufficient error handling already in the
application, the application could crash.

The usefulness of the '619 patent is also limited because
it assumes the minimum common hardware for the servers.
If all servers but one had a mouse, the conferencing system
wouldn't be able to recognize mouse capabilities. If the user
without the mouse just wanted to observe and not input, the
conference is still limited. This limits the capabilities of the
system. This limitation could be even more of a problem if
the hardware capabilities are mutually exclusive. The
requirement that the workstations and applications be
predetermined before invoking the conference, and can't be
changed during the conference, limits the usefulness of the
system of the '619 patent. If during the conference it is
determined that different users or applications are either
necessary or unnecessary, the conference must be closed and
restarted with the new configuration.

Therefore, it will be apparent that a need exists for an
improved method and system whereby a standardized
system allows a single application to be distributed to multiple
users in a conference to use concurrently without
modification to the application.

SUMMARY OF THE INVENTION

It is therefore one object of the present invention to
provide a system which allows multiple servers to use a
single copy of an application.

It is another object of the present invention to allow each
participant in the conference to interact with the application.

It is yet another object of the present invention to
distribute application requests to one or more participants' server
and manage the responses.

It is yet another object of the present invention to control
events from each of the servers in the conference and decide
which to send to the application.

The foregoing and other objects are achieved as is now
described. A method and system provide for selectively
distributing communications between an application and
multiple users, allowing cooperative use of a single copy of
an application. The system is situated between an
application and the multiple servers running the application.
Requests from an application, and the responses to them,
from multiple servers running the single application, and
events from multiple servers running the application, are
managed in such a way that each server believes it is
connected directly to the application and the application
believes it is connected directly to a single server. The
requests are categorized and distributed to the servers based
on the type of request. The responses to these requests may
be sent to the application or discarded based on the type of
request and the role of the server sending the request. The
events are also categorized and, based on the role of the
server causing the event, they may be passed on to the
application or discarded.

BRIEF DESCRIPTION OF THE DRAWINGS

The novel features believed characteristic of the invention
are set forth in the appended claims. The invention itself
however, as well as a preferred mode of use, further objects
and advantages thereof, will best be understood by reference
to the following detailed description of an illustrative
embodiment when read in conjunction with the
accompanying drawings, wherein:

FIG. 1 depicts a block diagram of a conference in
accordance with a preferred embodiment of the present invention;

FIG. 2 depicts a logic flow illustrating event handling in
the preferred embodiment of FIG. 1; and

FIG. 3 depicts a logic flow illustrating request handling in
the preferred embodiment of FIG. 1.

DETAILED DESCRIPTION OF A PREFERRED
EMBODIMENT

With reference to the figures and in particular with
reference to FIG. 1, there is depicted a pictorial
representation of the flow of information in the preferred system. The
present invention is part of the conferencing enabler 102.
The conference enabler 102 comprises software that
conceptually resides between an X Windows application and the
X server. A conference is the shared use of an application by
multiple users each having the same view of the program.
The conference exists in an X Windows environment with
multiple X servers. [illegible]
variety of networks and workstations which could be used.

The conferencing enabler 102 is a program which runs
continuously as a daemon process in the background. It is
conceptually situated between an application 100 and an X
server. Its physical location is irrelevant, as it may actually
be on the same workstation as the application, or on the X
server, or at another location. It is only required that both
the X servers in the conference and the application have
network access to the conferencing enabler. The
conferencing enabler 102 appears to the application 100 to be an X
server, while at the same time appearing to an X server to be
an application. The conferencing enabler 102 then connects
to multiple X servers on behalf of the application. Each
participant in the conference may interact with the
distributed application. The application 100 does not know that it
is being distributed to multiple X servers. The conferencing
enabler 102 determines how to multiplex and de-multiplex
the requests from the application 100 and the replies, events
and [illegible] in such a way that [illegible].

The conferencing enabler 102 receives requests 104 from
an application 100 and distributes them among the servers
108, 114, 116 in the conference 124 associated with that
application 100. When a request is received the
conferencing enabler 102 must determine which X servers must
receive the request. The conferencing enabler 102 also
receives events and responses to requests 112, 120, 122 from
the X servers 108, 114, 116 and determines which are sent
106 to the application 100. The events from the X servers are
managed in such a way as to present a consistent input
stream for the application 100.

The conference 124 begins by a user requesting a
conference from their X server, in this case X server 108. Only
certain users may start a conference. The X server which
initiates the conference, in this case X server 108, is given
the role of master. The master controls the conference 124.
Only one X server can be the master in a conference and
once the role is established it is irrevocable. If the master 108
leaves the conference 124, the conference 124 is closed and
all applications associated with that conference 124, in this
case 100, are terminated. The first X server 108 is also
initially assigned the role of input focus. There can only be
one server assigned the role of input focus for each
application 100 in the conference 124 at any one time, but this
role can be switched among the X servers 108, 114, 116 in
the conference by predetermined triggers.

For purposes of example, X server 114 is the input focus.
The input focus 114 is the only server allowed to input to the
application by key presses or button events. The X servers
of the users in the conference that are not assigned the role
of master 108 or input focus 114 have no specific role. These
servers can display and manipulate the display of the
applications but they cannot input to the application without
obtaining the input focus role. Because the application
believes that it is connected to a single server, the role of
master 108 represents the hardware of the single server and
the input focus 114 represents the display and input of the
single server. The master 108 is not allowed to change because
the hardware of the X server attached to the application is
not expected to change while the application is running.
Limiting the input focus to one X server at a time ensures
that the application has one view of the display and input
from one X server. Otherwise the application is required to
do something it wasn't designed to do.

After the first X server 108 has started the conference 124,
other users may request to join the conference 124 from their
servers, X server 114 and X server 116. With each request a
small amount of code is placed on the server regarding:
members of the conference, who can join, who can launch
an application, etc. These are typical user interface
functions, known in the art, and will not be further described
herein. After the conference 124 is established, members
108, 114, 116 of the conference 124 can leave, except the
master 108, and others can be added. Once the master 108
has started the conference 124, certain X servers can launch
an application 100 for the conference. More than one
application 100 can be launched for a given conference 124
with the same copy of the conferencing enabler 102. The
conference enabler 102 can distinguish the requests,
responses and events for the different applications.

The invention takes the events 112, 120, 122 from the X
servers 108, 114, 116 and determines which category they
are a member of, and which X server they originated from,
to determine whether they are sent to the application 100 or
discarded. FIG. 2 is a flow representation of the logic
involved in determining whether the events 112, 120, 122
are sent to the application 100. Referring to FIG. 2, when the
conferencing enabler 102 receives 200 an event 112, 120,
122 from an X server 108, 114, 116, the type of event and
the X server from which it originated determine whether the
event is passed on to the application 100 from the X servers
108, 114, 116.

If the event 112, 120, 122 received 200 is of the type
which is due to a user interaction 202, such as a key press,
if (204) it is from the input focus 114 it is sent 206 to the
application 100. If it is not from the input focus it is
discarded 216 unless 210 it is an allowable input focus
switch, in which case the participant sending the event is
now the input focus 212 and 206 the event is sent to the
application 100. The rationale for sending these type of
events to the application 100 is that the input focus 114 is the
only X server allowed to supply input to the application 100.
This ensures that the application 100 does not receive an
inconsistent sequence of events originating from multiple
participants. The exception exists when the focus is to be
changed.

If the event 112, 120, 122 received 200 is of the type
which causes an application 100 to redraw itself 208 and 214
it is from an X server in the conference 124 it is 218 sent to
the application 100, otherwise it is discarded 216. An
example of this type of event occurs when a window which
was previously covered becomes exposed. The rationale for
sending these type of events to the application 100 if they are
from any X server in the conference 124 is to ensure that the
application 100 is displayed correctly on every X server.
Since participants can interact independently with the
conferenced application, taking these type of events only from
one X server will not ensure the application is correctly
displayed on every X server. If a user is working on a
different application, their view of the conferenced
application will be updated, but they will be otherwise unaffected.

If the event 112, 120, 122 of any type is received 200 from
another application and it is 222 from the master 108 in the
conference 124 it is 218 sent to the application 100,
otherwise it is discarded 224. The rationale for sending these type
of events to the application 100 only if they are from the
master 108 in the conference 124 is because if this type of
event is received it means that two applications are trying to
communicate. If the other application, the one sending the
event, is also part of the conference then the event will only
be received on the master and it should be sent on to the
application. If the event is being sent from an application
that is not part of the conference, then it is only sent if it is
received on the master so as to continue the illusion that the
master's X server is the only one connected to the
application.

If the event 112, 120, 122 received 200 is of the type
which reflects a change in the state of the client's resources
232 and 228 it is from the input focus 114 for the application
100, it is 230 sent to the application 100, otherwise it is
discarded 216. An example of this type of event is one which
indicates the window has become mapped. The rationale for
sending these type of events to the application 100 only if
they are from the input focus 114 is because these type of
events are mostly generated as a result of a request from
the application 100 or from another client connected to the
same X server and are most likely [illegible] in response to a user
interaction from the input focus 114. They are sent only from
the input focus 114 to ensure the appearance of a single X
server inputting to the application 100.

If the event 112, 120, 122 received 200 is of the type
which reflects a change in the state of the X server 226 and
it is from the master 108 in the conference 124, it is 230 sent
to the application 100, otherwise it is discarded 234. An
example of this type of event is one which indicates that the
keyboard key mappings have changed. The rationale for
sending these type of events to the application 100 only if
they are from the master in the conference 124 is to ensure
the application 100 is presented with a consistent X server.
Because the application 100 believes it is connected to a
single X server, the application 100 could become confused
if state changes were forwarded from other servers.

The conferencing enabler receives the requests 104 from
the application 100 and determines which type they are to
determine which X servers receive the requests 104 and
which responses to these requests to send back to the
application 100. FIG. 3 is a representation of the logic
involved in determining which X server 108, 114, 116 the
requests 104 from the application 100 are sent to. When the
conferencing enabler 102 receives 300 a request 104 from an
application 100, the type of request 104 determines which X
servers 108, 114, 116 the request is sent to and which replies
to the request 104 are to be received by the application 100
from the X servers 108, 114, 116.

If the request 104 is of the type which is to draw geometry
or change the state of a resource 302 then 304 the request
104 is sent to all X servers 108, 114, 116. An example of this
type of request is one which asks to draw a line. The
rationale for sending these type of requests to all of the X
servers is so each X server maintains nearly identical states.
It is important that everyone is able to view the same image.
There are no replies to these type of requests, but an X server
may respond with an error. If an error is received from an X
server, generally they are discarded 306 unless they are from
the master 108. The reason only the errors from the master
108 are passed back to the application 100 is because the
errors will usually be due to something the application did
wrong, so it is sufficient to receive the error from one X
server and for consistency the master 108 is chosen.

If the request 104 is of the type which queries the state of
the display 308 of an X server, then 310 the request 104 is
sent to the input focus 114. An example of this type of
request is one which asks about the current size or contents
of a window. The rationale for sending these type of requests
only to the input focus 114 is that each X server can have
significantly different views because they are allowed to
minimize, cover or even close windows they are not
specifically interested in. The input focus 114 should have a
view which matches the application's perception of the state
of the display, so it should be only the X server 114 which
is allowed to provide input to the application 100 which
supplies the state of the display. It is important that the
application 100 believe it is attached to only one X server
with only one display. The replies to these type of requests
are received 312 only from the input focus 114 for the same
reasons they are only sent to the input focus 114.

If the request 104 is of the type which queries the state of
the X server 314, then 316 the request 104 is sent to the
master 108. An example of this type of request is one which
asks about the fonts available on the server. The rationale for
sending these types of requests only to the master 108 is
that it is important that the application 100 believe it is
attached to only one X server 108, and the state of that X
server 108 is not expected to change during the running of
the application. The replies to these types of requests are
received 318 only from the master 108 for the same reasons
they are only sent to the master 108.

If the request 104 is of the type which queries or changes
the state of input devices 320 of an X server, then 322 the
request 104 is sent to the input focus 114. The rationale for
sending these types of requests only to the input focus 114 is
that each X server can have significantly different views.
Because these requests may change the state of input devices,
the input focus 114 should be forced to match the applica-
tion's 100 view of input devices while the other X servers
need not be. An example of this type of request is a request
which changes the location of the pointer. The input focus
114 should have the correct pointer location while other X
server users may be working on a different application and
their pointer should not be changed. It is important that the
application 100 believe it is attached to only one X server
with only one display. The replies to these types of requests
are received 324 only from the input focus 114 for the same
reasons they are only sent to the input focus 114.

If the request 104 is of the type which initializes resource
identifiers 326, then 328 the request 104 is sent to all X
servers 108, 114, 116. An example of this type of request is
one which asks the server to create an atom. The rationale
for sending these types of requests to all of the X servers is
that the request creates a resource to be used in the future,
so that resource must be created on all of the X servers in
order for the subsequent request to be successful. Some of
these types of requests may generate either replies or errors.
Replies are sent from the master, simply so that the appli-
cation receives resource identifiers that are in accord with
those that exist on the master. Errors are returned from the
master because an error from this type of request usually
indicates that the application has done something wrong to
generate that error. For consistency, therefore, the error is
returned if it comes from the master.
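
The routing rules described for the five request categories can be summarized as a lookup table. The following is a minimal sketch in Python, offered only as an illustration and not as the patented implementation. The names `Category`, `Route`, `ROUTES`, `targets` and `forward_to_app` are hypothetical, and X servers are represented simply by their reference numerals (master 108, input focus 114, other participant 116).

```python
from dataclasses import dataclass
from enum import Enum, auto


class Category(Enum):
    """Request categories from the description (enum names are illustrative)."""
    DRAW_OR_CHANGE = auto()     # e.g. draw a line
    QUERY_DISPLAY = auto()      # e.g. size or contents of a window
    QUERY_SERVER = auto()       # e.g. fonts available on the server
    INPUT_DEVICE = auto()       # e.g. change the pointer location
    INIT_RESOURCE_ID = auto()   # e.g. create an atom


@dataclass(frozen=True)
class Route:
    send_to: str        # "all", "master" or "focus"
    respond_from: str   # role whose replies or errors reach the application


# Assumed encoding of the rules in the text; draw requests have no replies,
# so "respond_from" covers only their errors.
ROUTES = {
    Category.DRAW_OR_CHANGE: Route("all", "master"),
    Category.QUERY_DISPLAY: Route("focus", "focus"),
    Category.QUERY_SERVER: Route("master", "master"),
    Category.INPUT_DEVICE: Route("focus", "focus"),
    Category.INIT_RESOURCE_ID: Route("all", "master"),
}


def targets(category, servers, master, focus):
    """Return the X servers that a request of this category is sent to."""
    send_to = ROUTES[category].send_to
    if send_to == "all":
        return list(servers)
    return [master] if send_to == "master" else [focus]


def forward_to_app(category, source, master, focus):
    """True if a reply or error from `source` is passed back to the application."""
    role = ROUTES[category].respond_from
    return source == (master if role == "master" else focus)


servers = [108, 114, 116]
print(targets(Category.DRAW_OR_CHANGE, servers, master=108, focus=114))
print(targets(Category.QUERY_SERVER, servers, master=108, focus=114))
print(forward_to_app(Category.DRAW_OR_CHANGE, 116, master=108, focus=114))
```

In this sketch an error returned by X server 116 for a draw request is dropped, while the same error from the master 108 would be forwarded, which mirrors the discard step 306 described above.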

Several conferences can utilize the conferencing enabler
102 at the same time. The conferences can be made up of
different combinations of X servers. Each conference 124
has its own master 108 and each application within a
conference 124 has its own input focus 114. Each conference
124 can have more than one application running.
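
The relationship between the conferencing enabler, its conferences, the master and the per-application input focus can be pictured as a small data structure. The sketch below is an assumption made for illustration only: the class names `Conference` and `ConferencingEnabler`, the methods `set_focus` and `open`, and the workstation and application names are all hypothetical and do not appear in the description.

```python
from dataclasses import dataclass, field


@dataclass
class Conference:
    """One conference: a set of X servers, one master, one input focus per application."""
    master: str
    servers: set
    focus_by_app: dict = field(default_factory=dict)  # application -> input focus server

    def __post_init__(self):
        if self.master not in self.servers:
            raise ValueError("the master must be one of the conference's X servers")

    def set_focus(self, app, server):
        """Add an application, or switch its input focus to another participant."""
        if server not in self.servers:
            raise ValueError("the input focus must be an X server in the conference")
        self.focus_by_app[app] = server


@dataclass
class ConferencingEnabler:
    """Holds several concurrent conferences, keyed by a hypothetical conference name."""
    conferences: dict = field(default_factory=dict)

    def open(self, name, master, servers):
        conference = Conference(master, set(servers))
        self.conferences[name] = conference
        return conference


enabler = ConferencingEnabler()
design = enabler.open("design", master="ws1", servers=["ws1", "ws2", "ws3"])
helpdesk = enabler.open("helpdesk", master="ws4", servers=["ws4", "ws2"])
design.set_focus("cad", "ws2")        # two applications in one conference,
design.set_focus("simulator", "ws3")  # each with its own input focus
```

Here the X server `ws2` takes part in two conferences at once, and the `design` conference runs two applications whose input focus is held by different participants.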

A typical example of the use of this system and method is 
in a help desk situation. If a user is having a problem with
an application they can conference with the helper. The user
can show the helper where the problem is on the application 
and the helper can see the same view of the application the 
user is interacting with, without being in the same room 
looking at the display of the user's X server. The input focus 
can then be switched, allowing the helper to show the user 
how to fix the problem or the correct way to operate the 
application such that the user sees the helper's interaction 
with the application. During this session additional applica- 
tions or people can be added to solve the problem. The user 
and helper communicate over the telephone during the 
conference to explain what they are doing and to coordinate 
the switching of the input focus. 

Another example of the use of the invention is a team 
working on a project, each member of the team possibly 
having a workstation with a different hardware configuration 
than the others. The project leader begins the conference and 
the team joins the conference. While the engineers work 
together to design their widget, one of the engineers gets an 
E-mail message. She switches over to the E-mail to read her 
message while the other engineers continue designing and 
are unaffected. After their design is complete, one of the 
engineers brings up a simulation program to see how the 
new design will perform. All participants can see the simu- 
lation program. To interpret the data another engineer is 
added to the conference to see where problems exist in the 
new design as pointed out by the simulation program. Once 
the other engineers know what problems they need to fix, the 
simulation analyst leaves the conference and the conference 
drops the simulation program. The problem doesn't involve 
some of the engineers so they too may leave the conference. 

While the invention has been particularly shown and 
described with reference to a preferred embodiment, it will 
be understood by those skilled in the art that various changes 
in form and detail may be made therein without departing 
from the spirit and scope of the invention. 

What is claimed is: 

1. A management system for executing collaborative and 
non-collaborative applications within a distributed comput-
ing environment, said system comprising:

at least one collaborative application; 

at least one non-collaborative application; 

a plurality of participants, wherein one participant is 
assigned the role of master, for controlling the system, 
and one participant is assigned the role of input focus, 
for inputting data to said at least one collaborative 
application; 

an interface which facilitates communication between 
each of said plurality of participants and said at least 
one collaborative application; and 

means within said interface for selectively classifying and 
distributing communication requests from said at least 
one collaborative application to said plurality of par- 
ticipants based upon categories of the classified 
requests, and responses and events from said plurality 
of participants to said at least one collaborative appli- 
cation wherein selected requests from said collabora- 
tive application are sent to all of said plurality of 
participants; wherein selected requests from said at 
least one collaborative application which query the 
state of a participant not directly related to a display are 
sent only to said master participant; and wherein 
requests from said collaborative application which 
query the state of input devices are sent only to said 
input focus participant so as to permit the manipulation 
and execution of said at least one non-collaborative 
application by all other participants. 
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2. The management system of claim 1, wherein said 
classified request categories include: 

requests to draw or change the state of the operating 
system's resources, requests to query the state of the 
display, requests to query the state of a participant not
directly related to the display, requests to query the 
state of input devices, and requests which initialize 
resource identifiers for said at least one collaborative 
application. 

3. The management system of claim 2, wherein:

said requests from said at least one collaborative appli-
cation which draw or change the state of the operating
system's resources are sent to all of said plurality of
participants;

said requests from said at least one collaborative appli-
cation which query the state of the display are sent only
to said input focus participant; and

said requests from said at least one collaborative appli-
cation which initialize resource identifiers are sent to all
of said plurality of participants.

4. The management system of claim 3, wherein selected 
responses from said plurality of participants to requests are 
sent to said at least one collaborative application based on 
the category of request to which said response is responsive.

5. The management system of claim 4, wherein:

responses to requests to draw or change the state of said
plurality of participants' resources are sent only from
said master participant;

responses to requests to query the state of the display are
sent from said input focus participant;

responses to requests which query the state of said plu-
rality of participants, not directly related to the display,
are only sent from said master participant;

responses to requests which query the state of input
devices are sent only from said input focus participant;
and

responses to requests which initialize resource identifiers
are sent only from said master participant.

6. The management system of claim 1, wherein said 
means for selectively classifying comprises means for clas- 
sifying events, wherein events are classified into categories 
and selected events are forwarded to said at least one 
collaborative application based on the category of the event.

7. The management system of claim 6, wherein said event 
categories comprise: 

events due to a user interaction, events which cause an 
application to redraw itself, events from another 
application, events which reflect a change in state of
said plurality of participants, and events which reflect 
a change in state of said at least one collaborative 
application's resources. 

8. The management system of claim 7, wherein: 

said events due to a user interaction are forwarded only
from said input focus participant;

said events due to a user interaction which may trigger a
participant switch are forwarded only from a non-input
focus participant, and said input focus role is switched
to said non-input focus participant sending the event;

said events which cause said at least one collaborative
application to redraw itself are sent from all of said
plurality of participants;

said events from another application are sent only from
said master participant;

said events which reflect a change in the state of said
plurality of participants are sent only from said master
participant; and

said events which reflect a change in the state of said at
least one collaborative application's resources are sent
only from said input focus participant.

9. A method for management and classification of 
requests and their subsequent responses for a multiple 
participant system including a collaborative application and 
a non-collaborative application, comprising the steps of: 

sending requests asking a participant to draw, or change 
a state of a participant's resources, from said collabo- 
rative application to all participants; 

sending requests querying a state of a display from said 
collaborative application to an input focus participant, 
and sending any replies to such requests only from the 
input focus participant to said collaborative applica- 
tion; 

sending requests querying a state of a participant from 
said collaborative application to a master participant, 
and sending responses to such requests from the master 
participant to said collaborative application; 

sending requests querying or changing a state of input 
devices from said collaborative application to the input 
focus participant, so as to permit the manipulation and 
execution of said non-collaborative application by all 
other participants and sending responses to such 
requests from the input focus participant to said col- 
laborative application; and 

sending requests initializing resource identifiers for said 
collaborative application from said collaborative appli- 
cation to all participants, and sending responses to such 
requests from all participants to said collaborative 
application. 

10. The method of claim 9 further comprising the steps of: 
sending events, due to a user interaction, to the collabo- 
rative application if they are from an input focus 
participant; 

sending events, due to a user interaction, to said collabo- 
rative application if they are to switch the input focus 
to another participant in said conference; 

sending events, which cause the collaborative application 
to redraw itself, to said collaborative application if they 
are from a participant in the conference;

sending events, from another application to said collabo- 
rative application if they are from a master participant; 

sending events, which reflect a change in the state of a 
participant, to said collaborative application if they are 
from the master participant; 

sending events, caused by a change in the state of a 
participant's resources, to said collaborative applica- 
tion if they are from the input focus participant. 

* * * * *